STAVICTA Group Report for RepLab 2014 Reputation Dimension Task
Abstract
In this paper we present our experiments on the RepLab 2014 Reputation Dimension task. RepLab is a competitive challenge for Reputation Management Systems. RepLab 2014’s reputation dimensions task focuses on the categorization of Twitter messages with regard to standard reputation dimensions (such as performance, leadership, or innovation). Our approach relies only on the textual content of tweets and ignores both metadata and the content of URLs within tweets. We carried out several experiments focusing on different feature sets, including bag of n-grams, distributional semantics features, and deep neural network representations. The results show that bag-of-bigram features with minimum frequency thresholding work quite well on the reputation dimension task, especially with regard to the average F1 measure over all dimensions, where two of our four submitted runs achieve the highest and second-highest scores. Our experiments also show that semi-supervised recursive autoencoders outperform the other feature sets used in our experiments with regard to accuracy and are a promising subject for future research.
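As a rough illustration of the two ideas the abstract leans on, the sketch below builds bag-of-bigram count vectors with a minimum frequency threshold and computes an F1 score averaged over classes. It is not the authors' implementation: the whitespace tokenizer, the threshold value of 2, the reading of "average F1 over all dimensions" as an unweighted (macro) average, and all function names are assumptions made for this example.

```python
from collections import Counter


def bigrams(text):
    """Lowercase, split on whitespace (assumed tokenizer), pair adjacent tokens."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


def build_vocabulary(texts, min_freq=2):
    """Map each bigram seen at least min_freq times in the corpus to an index.

    min_freq=2 is an illustrative value, not one taken from the paper.
    """
    counts = Counter(bg for text in texts for bg in bigrams(text))
    kept = sorted(bg for bg, count in counts.items() if count >= min_freq)
    return {bg: index for index, bg in enumerate(kept)}


def vectorize(text, vocab):
    """Return a count vector over the thresholded bigram vocabulary."""
    vector = [0] * len(vocab)
    for bg in bigrams(text):
        if bg in vocab:
            vector[vocab[bg]] += 1
    return vector


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 (assumed meaning of 'average F1')."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Made-up example tweets, not RepLab data.
    tweets = [
        "new phone launch today",
        "new phone is great",
        "ceo resigns today",
    ]
    vocab = build_vocabulary(tweets, min_freq=2)
    print(vocab)
    print(vectorize("New phone new phone", vocab))
    print(macro_f1(["a", "a", "b"], ["a", "b", "b"]))
```

The threshold discards bigrams that occur only once, which keeps the feature space small for short, noisy texts such as tweets. The resulting vectors could be passed to any standard classifier.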
Related resources
University of Glasgow Terrier Team / Project Abacá at RepLab 2014: Reputation Dimensions Task
This paper describes our participation in the RepLab 2014 Reputation Dimensions task. The task is a multi-class classification task where tweets relating to an entity of interest are to be classified by their reputation dimension. For our participation, we investigate two approaches. First, we use a term’s Gini index score to quantify the term’s representativeness of a specific class and const...
Feature Selection and Data Sampling Methods for Learning Reputation Dimensions
We report on our participation in the reputation dimension task of the CLEF RepLab 2014 evaluation initiative, i.e., to classify social media updates into eight predefined categories. We address the task by using corpus-based methods to extract textual features from the labeled training data to train two classifiers in a supervised way. We explore three sampling strategies for selecting trainin...
Tweet Enrichment for Effective Dimensions Classification in Online Reputation Management
Online Reputation Management (ORM) is concerned with the monitoring of public opinions on social media for entities such as commercial organisations. In particular, we investigate the task of reputation dimension classification, which aims to classify tweets that mention a business entity into different dimensions (e.g. “financial performance” or “products and services”). However, producing a g...
CIRGIRGDISCO at RepLab2014 Reputation Dimension Task: Using Wikipedia Graph Structure for Classifying the Reputation Dimension of a Tweet
Social media repositories serve as a significant source of evidence when extracting information related to the reputation of a particular entity (e.g., a particular politician, singer or company). Reputation management experts manually mine the social media repositories (in particular Twitter) for monitoring the reputation of a particular entity. Recently, the online reputation management evalu...